Fine-Grained Multi-label Sexism Classification Using a Semi-Supervised Multi-level Neural Approach
Authors
Abstract
Sexism, a pervasive form of oppression, causes profound suffering through various manifestations. Given the increasing number of experiences of sexism shared online, categorizing these recollections automatically can support the battle against sexism, since it can promote successful evaluations by gender studies researchers and government representatives engaged in policy making. In this paper, we examine the fine-grained, multi-label classification of accounts (reports) of sexism. To the best of our knowledge, we consider substantially more categories of sexism than any related prior work through our 23-class problem formulation. Moreover, we present the first semi-supervised work for the multi-label classification of accounts describing any type(s) of sexism. We devise self-training-based techniques tailor-made for the multi-label nature of the problem to utilize unlabeled samples for augmenting the labeled set. We identify high textual diversity with respect to the existing labeled set as a desirable quality for candidate unlabeled instances and develop methods for incorporating it into our approach. We also explore ways of infusing class imbalance alleviation for multi-label classification into our semi-supervised learning, independently and in conjunction with the method involving diversity. In addition to data augmentation methods, we develop a neural model which combines biLSTM and attention with a domain-adapted BERT model in an end-to-end trainable manner. Further, we formulate a multi-level training approach in which models are sequentially trained using categories of sexism of different levels of granularity. We also devise a loss function that exploits any label confidence scores associated with the data. Several proposed methods outperform various baselines on a recently released dataset for multi-label sexism categorization across several standard metrics.
Similar resources
Semi-supervised Learning for Multi-label Classification
In this report we consider the semi-supervised learning problem for multi-label image classification, aiming at effectively taking advantage of both labeled and unlabeled training data in the training process. In particular, we implement and analyze various semi-supervised learning approaches including a support vector machine (SVM) method facilitated by principal component analysis (PCA), and ...
Multi-label ASRS Dataset Classification Using Semi Supervised Subspace Clustering
There has been a lot of research targeting text classification. Much of it focuses on a particular characteristic of text data: multi-labelity. This arises because a document may be associated with multiple classes at the same time. The consequence of this characteristic is the low performance of traditional binary or multi-class classification techniques on multi-label text data....
Semi-supervised Latent Dirichlet Allocation for Multi-label Text Classification
This paper proposes a semi-supervised latent Dirichlet allocation (ssLDA) method, which differs from the existing supervised topic models for multi-label classification in two main aspects. First, both labeled and unlabeled data are used in ssLDA to train a model, which is very important for reducing the cost of manual labeling, especially when obtaining a fully labeled dataset is...
Semi-Supervised Dimension Reduction for Multi-Label Classification
A significant challenge in making learning techniques more suitable for general-purpose use in AI is to move beyond i) complete supervision, ii) low-dimensional data, and iii) a single label per instance. Solving this challenge would allow making predictions for large, high-dimensional datasets with multiple (but possibly incomplete) labelings. While other work has addressed each of these problems s...
Fine-Grained Emotion Detection in Suicide Notes: A Thresholding Approach to Multi-Label Classification
We present a system to automatically identify emotion-carrying sentences in suicide notes and to detect the specific fine-grained emotion conveyed. With this system, we competed in Track 2 of the 2011 Medical NLP Challenge, where the task was to distinguish between fifteen emotion labels, from guilt, sorrow, and hopelessness to hopefulness and happiness. Since a sentence can be annotated with ...
Journal
Journal title: Data Science and Engineering
Year: 2021
ISSN: 2364-1541, 2364-1185
DOI: https://doi.org/10.1007/s41019-021-00168-y